Update solr-ocrhighlighting and make SOLR_HOCR_PLUGIN_PATH available to php-fpm #345
@@ -436,3 +436,5 @@ clear_env = yes
 ;php_admin_value[error_log] = /var/log/php83/$pool.error.log
 ;php_admin_flag[log_errors] = on
 ;php_admin_value[memory_limit] = 32M
+
+env['SOLR_HOCR_PLUGIN_PATH'] = "{{ getenv "SOLR_HOCR_PLUGIN_PATH" }}"
This should be removed; see the comment on nginx/Dockerfile.
@@ -89,6 +89,7 @@ ENV \
 PHP_POST_MAX_SIZE=128M \
 PHP_PROCESS_CONTROL_TIMEOUT=60 \
 PHP_REQUEST_TERMINATE_TIMEOUT=60 \
-PHP_UPLOAD_MAX_FILESIZE=128M
+PHP_UPLOAD_MAX_FILESIZE=128M \
+SOLR_HOCR_PLUGIN_PATH=/opt/solr/server/solr/contrib/ocrhighlighting/lib
Since the nginx image is used by quite a few downstream images, I think this should go in the Drupal image, since it is the only one that makes use of this environment variable.
It's a shame that this isn't part of the site configuration, as is typical. This makes it work differently than every other environment variable used to configure Drupal modules.
Normally, it should go here, and follow the existing conventions for supporting multi-sites, i.e. DRUPAL_DEFAULT_SOLR_HOCR_PLUGIN_PATH. Then it would be used to configure each site.
isle-buildkit/drupal/Dockerfile
Lines 28 to 66 in afaf688
ENV \
[email protected] \
DRUPAL_DEFAULT_ACCOUNT_NAME=admin \
DRUPAL_DEFAULT_ACCOUNT_PASSWORD=password \
DRUPAL_DEFAULT_BROKER_HOST=activemq \
DRUPAL_DEFAULT_BROKER_PORT=61613 \
DRUPAL_DEFAULT_BROKER_URL=tcp://activemq:61613 \
DRUPAL_DEFAULT_BROKER_WEB_ADMIN_PASSWORD=password \
DRUPAL_DEFAULT_BROKER_WEB_ADMIN_USER=admin \
DRUPAL_DEFAULT_BROKER_WEB_PORT=8161 \
DRUPAL_DEFAULT_CANTALOUPE_URL=https://islandora.traefik.me/cantaloupe/iiif/2 \
DRUPAL_DEFAULT_CONFIGDIR=/var/www/drupal/config/sync \
DRUPAL_DEFAULT_DB_NAME=drupal_default \
DRUPAL_DEFAULT_DB_PASSWORD=password \
DRUPAL_DEFAULT_DB_USER=drupal_default \
[email protected] \
DRUPAL_DEFAULT_FCREPO_HOST=islandora.traefik.me \
DRUPAL_DEFAULT_FCREPO_PORT=8081 \
DRUPAL_DEFAULT_FITS_HOST=fits \
DRUPAL_DEFAULT_FITS_PORT=8080 \
DRUPAL_DEFAULT_INSTALL_EXISTING_CONFIG=false \
DRUPAL_DEFAULT_INSTALL=true \
DRUPAL_DEFAULT_LOCALE=en \
DRUPAL_DEFAULT_MATOMO_URL_HTTP=http://islandora.traefik.me/matomo/ \
DRUPAL_DEFAULT_MATOMO_URL_HTTPS=https://islandora.traefik.me/matomo/ \
DRUPAL_DEFAULT_NAME=Default \
DRUPAL_DEFAULT_PROFILE=standard \
DRUPAL_DEFAULT_SALT=9PPaL0CxZAIcq0l9wxgDGlCZrp7JdT_x7v9gVzpdbUjMt1PqDz3uD0Zy-i16DuJ1-Htuq5hqeg \
DRUPAL_DEFAULT_SITE_URL=https://islandora.traefik.me \
DRUPAL_DEFAULT_SOLR_CORE=ISLANDORA \
DRUPAL_DEFAULT_SOLR_HOST=solr \
DRUPAL_DEFAULT_SOLR_PORT=8983 \
DRUPAL_DEFAULT_SUBDIR=default \
DRUPAL_DEFAULT_TRIPLESTORE_HOST=blazegraph \
DRUPAL_DEFAULT_TRIPLESTORE_NAMESPACE=islandora \
DRUPAL_DEFAULT_TRIPLESTORE_PORT=8080 \
DRUPAL_ENABLE_HTTPS=true \
DRUPAL_REVERSE_PROXY_IPS= \
DRUPAL_SITES=DEFAULT
But that won't work in this case, as it needs to be exposed to php-fpm as an environment variable rather than as a Drupal configuration override.
So what we should do is create a file, drupal/rootfs/etc/php83/php-fpm.d/solr.conf, that contains:
# Configuration for https://github.com/discoverygarden/islandora_hocr
env['SOLR_HOCR_PLUGIN_PATH'] = "/opt/solr/server/solr/contrib/ocrhighlighting/lib"
Using the full path rather than a Docker image environment variable is fine since it's not configurable in the solr image either, so it does not need to vary.
The env var needs to be available in php-fpm, not nginx. While it's not ideal, putting it in the base nginx/php container is the easiest way to make it available to Drupal. Otherwise we need to override the entire www.conf for Drupal.
@joecorall sorry, I misunderstood initially. I've updated the comment with a solution for php-fpm.
Actually, if we could just have the Solr image always load the plugin, regardless of the solrconfig_extra.xml, that would probably be ideal, since this is the end goal anyway.
Will that add that directive to the www php-fpm pool? I'm trying to read the docs but they're a little sparse around this topic. I'll just try it locally and will make the change if it works.
Should do, this is what I see inside the container.
1d8b54aff6fb:/etc/php83# rg php-fpm.d php-fpm.conf
143:include=/etc/php83/php-fpm.d/*.conf
> Actually, if we could just have the Solr image always load the plugin, regardless of the solrconfig_extra.xml that would probably be ideal, since this is the end goal anyway.
Those Solr config files are created dynamically by Drupal, based on the search_api_solr module version. That makes managing this pretty difficult, and it's why we're stuck with this sort of hacky way to get the proper conf set in Solr.
> Should do, this is what I see inside the container.
But php-fpm confs define pools, and the env vars need to go into the pool. Our pool is set in www.conf. Adding the additional file with the env directive gives:
2024-07-30 11:27:55 [30-Jul-2024 15:27:55] ERROR: [/etc/php83/php-fpm.d/solr.conf:1] Array are not allowed in the global section
2024-07-30 11:27:55 [30-Jul-2024 15:27:55] ERROR: Unable to include /etc/php83/php-fpm.d/solr.conf from /etc/php83/php-fpm.conf at line 1
2024-07-30 11:27:55 [30-Jul-2024 15:27:55] ERROR: failed to load configuration file '/etc/php83/php-fpm.conf'
2024-07-30 11:27:55 [30-Jul-2024 15:27:55] ERROR: FPM initialization failed
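The first error line says the array-style env[...] directive was parsed in the global section, which matches the proposed file having no pool header above it. A pool-scoped variant of that file might look like the sketch below. It was not tried in this thread and is not what this PR does. It assumes php-fpm merges directives when two included files both declare [www] rather than rejecting the duplicate, which should be confirmed with php-fpm's -t (test configuration) flag before relying on it.

```ini
; Hypothetical drupal/rootfs/etc/php83/php-fpm.d/solr.conf (untested sketch).
; Configuration for https://github.com/discoverygarden/islandora_hocr

; Name the pool first so the env[] array directive is parsed inside a pool
; section instead of the global section. Assumption: www.conf, which also
; declares [www], is merged into the same pool when it is included afterwards.
[www]
env['SOLR_HOCR_PLUGIN_PATH'] = "/opt/solr/server/solr/contrib/ocrhighlighting/lib"
```

If that assumption does not hold, setting the directive in www.conf itself, as this PR does, remains the working option.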
Well, I was wrong. Sorry for wasting your time. I'm going to merge this even though it's not passing because we're still experiencing terrible rate-limiting problems. I'll keep an eye on the release and re-run the build in 6+ hours when our rate limit resets. After mid-August, we should no longer have to deal with the rate limits.
Addendum to #338
First, bump solr-ocrhighlighting to the latest version.
Then, in order to add the necessary Solr configuration for hOCR highlighting, the SOLR_HOCR_PLUGIN_PATH environment variable needs to be set and available in php-fpm. Without passing this environment variable into php-fpm's www conf, the php-fpm process is not able to read it, resulting in a solrconfig_extra.xml with
By passing the environment variable into the php-fpm process, downloading the Solr config XML through Drupal's UI results in